Early Impacts of M365 Copilot
Dillon, Eleanor Wiske, Jaffe, Sonia, Peng, Sida, Cambon, Alexia
New generative AI tools have been developing rapidly and are now widely used, including by workers in doing their jobs. Microsoft worked with firms across industries to run a large field experiment to measure how access to one of these tools changes work patterns. The experiment ran during the early rollout of Microsoft's M365 Copilot (Copilot), a multi-part generative AI tool that integrates generative AI into components of Microsoft's M365 suite (including Word, PowerPoint, Outlook, and Teams). M365 Copilot is designed as a general purpose tool to help workers digest information by summarizing emails, meetings, or documents, create new content by drafting emails, documents, or slide decks, and retrieve information either from the web or across any company email, chat, or document to which the worker has access. We worked with firms to randomize access to Copilot and got permission to use several months of anonymized metadata on workers' behaviors in Outlook, Teams, and Office, both before and after access to Copilot.
Linear-cost unbiased posterior estimates for crossed effects and matrix factorization models via couplings
Ceriani, Paolo Maria, Zanella, Giacomo
In recent years, unbiased Markov Chain Monte Carlo via couplings (UMCMC) has emerged as a promising framework to remove bias from MCMC estimates, thus potentially allowing for early stopping, simplifying the convergence diagnostic process and facilitating parallelization (Glynn and Rhee, 2014; Jacob et al., 2020). In UMCMC, coupled chains are run for a random number of iterations (at least up to coalescence) and their values are combined to produce unbiased estimates. A natural question that arises is whether this class of estimates incurs a greater computational cost than conventional MCMC based on simple ergodic averages and to quantify this potential difference. Framing the question differently, one may ask whether it is possible to devise UMCMC methods with computational cost matching top performing MCMCs, while enjoying the above mentioned benefits. On a different line of research, various works showed how carefully designed blocked Gibbs Samplers (BGSs), i.e. Gibbs sampling schemes that update entire blocks of coordinates jointly, can achieve state-of-the-art performances for sampling from the posterior distributions of various challenging high-dimensional Bayesian models, such as non-nested models with crossed dependencies (Papaspiliopoulos et al., 2019, 2023). In particular, BGSs achieve linear computational costs in the number of parameters and observations in asymptotic regimes where both diverge to infinity.
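For reference, the unbiased estimator from the cited coupling framework (Jacob et al., 2020) can be written as follows. The notation is an assumption introduced here and does not appear in the abstract: $h$ is the test function, $k$ a burn-in index, $(X_t)$ and $(Y_t)$ the two coupled chains with $Y$ lagging $X$ by one step, and $\tau$ the meeting time.

```latex
\[
  \tau = \inf\{\, t \ge 1 : X_t = Y_{t-1} \,\}, \qquad
  H_k(X, Y) = h(X_k) + \sum_{t=k+1}^{\tau-1} \bigl\{ h(X_t) - h(Y_{t-1}) \bigr\}.
\]
```

The sum is empty when $\tau \le k+1$, so the correction term vanishes once the chains have met before the burn-in index. Under the conditions stated in that reference, $\mathbb{E}[H_k(X, Y)] = \mathbb{E}_{\pi}[h]$, and the cost of one estimate is driven by $\max(k, \tau)$ coupled iterations, which is why the distribution of the meeting time governs the cost comparison with conventional MCMC.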
High-Precision, Fair University Course Scheduling During a Pandemic
Petering, Matthew E. H., Khamechian, Mohammad
Scheduling university courses is extra challenging when classroom capacities are reduced because of social distancing requirements that are implemented in response to a pandemic such as COVID-19. In this work, we propose an expanded taxonomy of course delivery modes, present an integer program, and develop a course scheduling algorithm to enable all course sections -- even the largest -- to have a significant classroom learning component during a pandemic. Our approach is fair by ensuring that a certain fraction of the instruction in every course section occurs in the classroom. Unlike previous studies, we do not allow rotating attendance and instead require simultaneous attendance in which all students in a section meet in 1-5 rooms at the same time but less often than in a normal semester. These mass meetings, which create opportunities for in-person midterm exams and group activities, are scheduled at high precision across all days of the semester rather than a single, repeating week. A fast heuristic algorithm makes the schedule in an hour. Results: We consider the 1834 in-person course sections, 172 classrooms, and 96 days in the fall 2022 semester at [UniversityXYZ]. If average classroom capacity is reduced by 75% due to a pandemic, our approach still allows at least 25% of the instruction in every section, and more than 49% of all instruction across the entire campus, to be in the classroom. Our method also produces excellent results for regular classroom assignment. Managerial implications: An algorithm based on the principles of fairness and simultaneous attendance can significantly improve university course schedules during a pandemic and in normal times. High-precision schedules that prepare a campus for various pandemic possibilities can be created with minimal administrative effort and activated at a moment's notice before or during a semester if an outbreak occurs.
Accelerating Asynchronous Federated Learning Convergence via Opportunistic Mobile Relaying
This paper presents a study on asynchronous Federated Learning (FL) in a mobile network setting. The majority of FL algorithms assume that communication between clients and the server is always available; however, this is not the case in many real-world systems. To address this issue, the paper explores the impact of mobility on the convergence performance of asynchronous FL. By exploiting mobility, the study shows that clients can indirectly communicate with the server through another client serving as a relay, creating additional communication opportunities. This enables clients to upload local model updates sooner or receive fresher global models. We propose a new FL algorithm, called FedMobile, that incorporates opportunistic relaying and addresses key questions such as when and how to relay. We prove that FedMobile achieves a convergence rate $O(\frac{1}{\sqrt{NT}})$, where $N$ is the number of clients and $T$ is the number of communication slots, and show that the optimal design involves an interesting trade-off on the best timing of relaying. The paper also presents an extension that considers data manipulation before relaying to reduce the cost and enhance privacy. Experimental results on a synthetic dataset and two real-world datasets verify our theoretical findings.
Automatic driving path plan based on iterative and triple optimization method
This paper presents a triple optimization algorithm over two-dimensional space, driving path, and driving speed that iterates in the time dimension to obtain a locally optimal path and speed within the optimal driving area. An iterative algorithm is designed to solve for the best driving path and speed under the given constraints. The algorithm can meet the path planning needs of automated vehicles in complex scenes and in medium- and high-speed scenes.
Coding algorithms in R for models written in Stan
On top of recommending the excellent autobiography of Stanislaw Ulam, this post is about using the software Stan, but not directly to perform inference, instead to obtain R functions to evaluate a target's probability density function and its gradient. With these, one can implement custom methods, while still benefiting from the great work of the Stan team on the "modeling language" side. As a proof of concept I have implemented a plain Hamiltonian Monte Carlo sampler for a random effect logistic regression model (taken from a course on Multilevel Models by Germán Rodríguez), a coupling of that HMC algorithm (as in "Unbiased Hamiltonian Monte Carlo with couplings", see also this very recent article on the topic of coupling HMC), and then upper bounds on the total variation distance between the chain and its limiting distribution, as in "Estimating Convergence of Markov chains with L-Lag Couplings". Basically the R script starts like a standard script that would use rstan for inference; it runs the default algorithm of Stan for a little while, then extracts some info from the "stanfit" object. With these, a pure R implementation of TV upper bounds for a naive HMC algorithm follows, which relies on functions called "stan_logtarget" and "stan_gradlogtarget" to evaluate the target log-pdf and its gradient.
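To make the idea of a "plain" HMC sampler built only from a log-density and its gradient concrete, here is a minimal sketch. It is written in Python rather than the R of the post, and it is not the post's script: `logtarget` and `grad_logtarget` are hypothetical stand-ins for the roles played by "stan_logtarget" and "stan_gradlogtarget", and the standard normal target, step size, and number of leapfrog steps are illustrative assumptions.

```python
import numpy as np


def leapfrog(x, p, grad_logtarget, stepsize, n_steps):
    """Leapfrog integration of Hamiltonian dynamics with identity mass matrix."""
    x = np.asarray(x, dtype=float)
    p = p + 0.5 * stepsize * grad_logtarget(x)      # initial half step for momentum
    for i in range(n_steps):
        x = x + stepsize * p                        # full step for position
        if i < n_steps - 1:
            p = p + stepsize * grad_logtarget(x)    # full step for momentum
    p = p + 0.5 * stepsize * grad_logtarget(x)      # final half step for momentum
    return x, p


def hmc_step(x, logtarget, grad_logtarget, stepsize, n_steps, rng):
    """One HMC transition: refresh momentum, integrate, accept or reject."""
    p0 = rng.standard_normal(np.shape(x))
    x_new, p_new = leapfrog(x, p0, grad_logtarget, stepsize, n_steps)
    # Difference in log joint density (target plus Gaussian kinetic energy).
    log_accept = (logtarget(x_new) - 0.5 * p_new @ p_new) - (
        logtarget(x) - 0.5 * p0 @ p0
    )
    if np.log(rng.uniform()) < log_accept:
        return x_new, True
    return x, False


if __name__ == "__main__":
    # Illustrative target (an assumption): a 2-d standard normal.
    def logtarget(x):
        return -0.5 * float(x @ x)

    def grad_logtarget(x):
        return -x

    rng = np.random.default_rng(0)
    x = np.zeros(2)
    n_accepted = 0
    for _ in range(1000):
        x, accepted = hmc_step(x, logtarget, grad_logtarget, 0.2, 8, rng)
        n_accepted += accepted
    print("acceptance rate:", n_accepted / 1000)
```

Because the sampler only touches the target through the two callables, swapping in functions derived from a fitted Stan model would leave the rest of the code unchanged, which is the separation of concerns the post describes.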
Unbiased Smoothing using Particle Independent Metropolis-Hastings
Middleton, Lawrence, Deligiannidis, George, Doucet, Arnaud, Jacob, Pierre E.
We consider the approximation of expectations with respect to the distribution of a latent Markov process given noisy measurements. This is known as the smoothing problem and is often approached with particle and Markov chain Monte Carlo (MCMC) methods. These methods provide consistent but biased estimators when run for a finite time. We propose a simple way of coupling two MCMC chains built using Particle Independent Metropolis-Hastings (PIMH) to produce unbiased smoothing estimators. Unbiased estimators are appealing in the context of parallel computing, and facilitate the construction of confidence intervals. The proposed scheme only requires access to off-the-shelf Particle Filters (PF) and is thus easier to implement than recently proposed unbiased smoothers. The approach is demonstrated on a Lévy-driven stochastic volatility model and a stochastic kinetic model.
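One natural way to couple two PIMH chains, sketched below under stated assumptions, is to give both chains the same particle-filter proposal and the same uniform draw at every step, so that they become identical as soon as both accept. This is an illustration and not necessarily the exact construction of the paper: `toy_particle_filter` is a hypothetical stand-in that returns a random log-likelihood estimate and a random trajectory, and the sketch omits the time lag between chains and the unbiased estimator itself.

```python
import numpy as np


def toy_particle_filter(rng):
    """Hypothetical stand-in for a particle filter.

    Returns (log_z_hat, trajectory); a real PF would return its
    log-likelihood estimate and one sampled latent trajectory.
    """
    return (rng.normal(0.0, 1.0), rng.standard_normal(5))


def coupled_pimh_step(state_x, state_y, run_particle_filter, rng):
    """One coupled PIMH step with a common proposal and a common uniform."""
    proposal = run_particle_filter(rng)     # shared by both chains
    log_u = np.log(rng.uniform())           # shared by both chains
    # Independent MH with the PF as proposal: accept with prob min(1, Z*/Z).
    new_x = proposal if log_u < proposal[0] - state_x[0] else state_x
    new_y = proposal if log_u < proposal[0] - state_y[0] else state_y
    return new_x, new_y


def meeting_time(run_particle_filter, rng, max_iter=10_000):
    """Run the coupled chains until they hold the same state object."""
    x = run_particle_filter(rng)
    y = run_particle_filter(rng)
    for t in range(1, max_iter + 1):
        x, y = coupled_pimh_step(x, y, run_particle_filter, rng)
        if x is y:                          # both accepted the same proposal
            return t
    return None


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    print("chains met at step:", meeting_time(toy_particle_filter, rng))
```

Once the chains hold the same state they accept or reject every later proposal together, so they remain equal forever; this faithful-coupling property is what makes the correction sum in an unbiased estimator finite.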
Google makes Docs, Drive and Calendar more productive
If you spend your work days toiling in Google's productivity apps, the first thing you might notice today is that Google for Work is now called "G Suite". Once you get past the new label, you might also notice a slew of smart updates across the board that ought to save you time and keep your workflow moving. First up: Docs, Sheets and Slides got a new "Explore" feature that uses natural language search to help you research reports, organize data or design better looking presentations. In each of the main apps, an Explore button brings up a new sidebar with contextual options based on the app you're using. In Docs, this means Explore will search and suggest images, web links or other Drive documents that appear relevant to the content you're writing.
Strategic Information Disclosure to People with Multiple Alternatives
Azaria, Amos (Bar-Ilan University) | Rabinovich, Zinovi (Bar-Ilan University) | Kraus, Sarit (Bar-Ilan University) | Goldman, Claudia V. (General Motors Advanced Technical Center, Israel)
This paper studies how automated agents can persuade humans to behave in certain ways. The motivation behind such an agent's behavior resides in the utility function that the agent's designer wants to maximize and which may be different from the user's utility function. Specifically, in the strategic settings studied, the agent provides correct yet partial information about a state of the world that is unknown to the user but relevant to his decision. Persuasion games were designed to study interactions between automated players where one player sends state information to the other to persuade it to behave in a certain way. We show that this game-theory-based model is not sufficient to model human-agent interactions, since people tend to deviate from the rational choice. We use machine learning to model how people deviate from this game-theory-based model. The agent generates a probabilistic description of the world state that maximizes its benefit and presents it to the users. The proposed model was evaluated in an extensive empirical study involving road selection tasks that differ in length, costs and congestion. Results showed that people's behavior indeed deviated significantly from the behavior predicted by the game-theory-based model. Moreover, the agent developed in our model performed better than an agent that followed the behavior dictated by the game-theoretical models.